Experimental research on behavior and cognition frequently rests on stimulus or subject selection where not all characteristics can be fully controlled, even when attempting strict matching. For example, when contrasting patients with controls, variables such as intelligence or socioeconomic status are often correlated with patient status. Similarly, when presenting word stimuli, variables such as word frequency are often correlated with primary variables of interest. One procedure very commonly employed to control for such nuisance effects is conducting inferential tests on confounding stimulus or subject characteristics. For example, if word length is not significantly different for two stimulus sets, they are considered as matched for word length. Such a test has high error rates and is conceptually misguided. It reflects a common misunderstanding of statistical tests: interpreting significance as referring not to inference about a particular population parameter, but to (1) the sample in question and (2) the practical relevance of a sample difference (so that a nonsignificant test is taken to indicate evidence for the absence of relevant differences). We show inferential testing for assessing nuisance effects to be inappropriate both pragmatically and philosophically, present a survey showing its high prevalence, and briefly discuss an alternative in the form of regression including nuisance variables.
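
The claim that a nonsignificant test is weak evidence of matching can be illustrated with a small simulation. The sketch below is not taken from the paper: the set size (20 items), the normally distributed word lengths, the one-letter true difference, and the approximate critical value of 2.02 are all illustrative assumptions, and the function names are hypothetical.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def nonsignificant_rate(true_diff, n_items=20, n_sims=2000, crit=2.02, seed=1):
    """Share of simulated stimulus-set pairs whose test is nonsignificant.

    Assumptions (illustrative only): word lengths are drawn from normal
    distributions with SD 2.0, and crit approximates the two-sided 5%
    critical value of t for roughly 38 degrees of freedom.
    """
    rng = random.Random(seed)
    nonsignificant = 0
    for _ in range(n_sims):
        set_a = [rng.gauss(6.0, 2.0) for _ in range(n_items)]
        set_b = [rng.gauss(6.0 + true_diff, 2.0) for _ in range(n_items)]
        if abs(welch_t(set_a, set_b)) < crit:
            nonsignificant += 1
    return nonsignificant / n_sims


# The populations really differ by one letter, yet many pairs pass as "matched".
rate_with_difference = nonsignificant_rate(true_diff=1.0)
# Reference case: the populations are truly identical.
rate_without_difference = nonsignificant_rate(true_diff=0.0)
print(f"nonsignificant with a true difference: {rate_with_difference:.2f}")
print(f"nonsignificant with no true difference: {rate_without_difference:.2f}")
```

Under these assumptions, a majority of stimulus-set pairs drawn from populations that genuinely differ would still be declared matched, which is the sense in which the test says little about either the sample at hand or the practical relevance of the difference.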